18 research outputs found

Using Minimal and Maximal Fault Tolerance for the Assessment of Fault-Tolerant Algorithms

Author: McMillin Bruce M.
Schollmeyer Martina
Publication venue: Scholars' Mine
Publication date: 21/10/1992

Missouri University of Science and Technology (Missouri S&T): Scholars' Mine

Using Temporal Subsumption for Developing Efficient Error-Detecting Distributed Algorithms

Author: McMillin Bruce M.
Schollmeyer Martina
Publication venue: Scholars' Mine
Publication date: 21/10/1993

Distributed algorithms can use executable assertions derived from program verification to detect errors at run-time. However, a complete verification proof outline contains a large number of assertions, and embedding all of them into the program to be checked at run-time would make error-detection very inefficient. The technique of temporal subsumption examines the dependencies between the individual assertions along program execution paths. In contrast to classical subsumption, where all logical expressions to be examined are true simultaneously, an assertion need only be true when the corresponding statement in the distributed program has been executed. Thus, temporal subsumption, based on the set of assertions derived from a verification proof in combination with the set of all legal states in the system, allows for the removal of (partial) assertions along execution sequences. We assume a fault model of Byzantine (malicious) behavior, and therefore an individual process cannot check itself for faults. We assume that a non-faulty process will always perform the correct computation, so that once external data (obtained through communication) has been verified, the local computation does not need to be checked. A non-faulty process can thus detect faults produced by a faulty process based on the information it receives from it.

Missouri University of Science and Technology (Missouri S&T): Scholars' Mine

Formal Generation of Executable Assertions for Application-Oriented Fault Tolerance

Author: Lutfiyya Hanan
McMillin Bruce M.
Schollmeyer Martina
Publication venue: Scholars' Mine
Publication date: 05/08/1992

Executable assertions embedded into a distributed computing system can provide run-time assurance by ensuring that the program state, in the actual run-time environment, is consistent with the logical state specified in the assertions; if not, then an error has occurred and a reliable communication of this diagnostic information is provided to the system such that reconfiguration and recovery can take place. Application-oriented fault tolerance is a method that provides fault detection using executable assertions based on the natural constraints of the application. This paper focuses on giving application-oriented fault tolerance a theoretical foundation by providing a mathematical model for the generation of executable assertions which detect faults in the presence of arbitrary failures. The mathematical model of choice was axiomatic program verification. A method was developed that translates a concurrent verification proof outline into an error-detecting concurrent program. This paper shows the application of the developed method to several applications.
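
The idea of an executable assertion built from an application's natural constraints can be illustrated with a minimal sketch. It is not the method from the paper: the `Message` type, the sortedness constraint, and the names `check_sorted_block` and `receive` are all hypothetical, chosen only to show a receiving process checking data from a peer and reporting a diagnostic instead of trusting it.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Message:
    sender: int        # id of the sending process (hypothetical)
    values: List[int]  # payload assumed to be a sorted block of keys


def check_sorted_block(msg: Message, lower_bound: int) -> bool:
    """Executable assertion over received data (assumed constraint).

    The constraint here is invented for illustration: the block must be
    in non-decreasing order and no key may fall below lower_bound.
    """
    vals = msg.values
    in_order = all(a <= b for a, b in zip(vals, vals[1:]))
    above_bound = all(v >= lower_bound for v in vals)
    return in_order and above_bound


def receive(msg: Message, lower_bound: int) -> Tuple[str, int]:
    """Check a peer's message before using it in the local computation."""
    if not check_sorted_block(msg, lower_bound):
        # Assertion failed: report which peer produced the bad data.
        return ("error", msg.sender)
    return ("ok", msg.sender)
```

In this sketch the receiver never checks its own computation; it only validates what arrives from other processes, which matches the assumption in the abstracts that a non-faulty process computes correctly once its inputs are verified.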

Missouri University of Science and Technology (Missouri S&T): Scholars' Mine

Using temporal subsumption to generate efficient error-detecting distributed algorithms

Author: Schollmeyer Martina
Publication venue: Scholars' Mine
Publication date: 01/01/1994

Missouri University of Science and Technology (Missouri S&T): Scholars' Mine

Noise generators for use in computer simulation

Author: Schollmeyer Martina
Publication venue: Scholars' Mine
Publication date: 01/01/1989

Missouri University of Science and Technology (Missouri S&T): Scholars' Mine

A General Method for Maximizing the Error-Detecting Ability of Distributed Algorithms

Author: Bruce McMillin
Martina Schollmeyer
Publication venue: Springer-Verlag
Publication date: 23/06/1993

The bound on component failures and their spatial distribution govern the fault tolerance of any candidate error-detecting algorithm. For distributed memory multiprocessors, the specific algorithm and the topology of the processor interconnection network define these bounds. This paper introduces the maximal fault index, derived from the system topology and local communication patterns, to demonstrate how a maximal number of simultaneous (Byzantine) component failures can be tolerated for a particular interconnection network and error-detecting algorithm. The index is used to design a fault-tolerant mapping of processes to processor groups such that the error-detecting ability of the algorithm is preserved for certain multiple simultaneous processor failures. 1 This work was supported in part by the National Science Foundation under Grant Numbers MSS9216479 and CDA-9222827, and, in part, from the Air Force Office of Scientific Research under contract numbers F49620-92-J-0546 and F4962..

CiteSeerX

Missouri University of Science and Technology (Missouri S&T): Scholars' Mine

Efficient Run-Time Assurance in Distributed Systems Through Selection of Executable Assertions

Author: Bruce McMillin
Martina Schollmeyer

Run-time assurance of a distributed system can be obtained by comparing, at run-time, the behavior of the program with the expected behavior described in the program's specification. Executable assertions, embedded into the program code, can determine when there are discrepancies, due to processor failures, between actual and expected behavior. Thus, there is no global monitoring scheme but processes will check each other. A non-faulty process will always perform correct computation. It can detect errors in other processes after receiving information from them and checking it against expected values by using executable assertions. In order to efficiently check programs at run-time, we need to determine how many assertions need to be used, where they need to be located, and what they need to check to ensure that all occurring errors can be detected. This paper introduces temporal subsumption to remove, from a given set of assertions for a specific program, the assertions which perform r..

CiteSeerX

Using Temporal Subsumption for Developing Efficient Error-Detecting Distributed Algorithms

Author: Bruce McMillin
Martina Schollmeyer

CiteSeerX

Efficient Run-time Assurance in Distributed Systems through Selection of Executable Assertions

Author: McMillin Bruce M.
Schollmeyer Martina
Publication venue: Elsevier BV
Publication date: 01/05/2000

Run-time assurance of a distributed system can be obtained by comparing, at run-time, the actual behavior of a program with the expected behavior described in the program's specification. Executable assertions, embedded into the program code, can determine when there are discrepancies between actual and expected behavior. There is no global monitoring scheme and error-detection will occur at the process level. We can assume that a non-faulty process will always perform correct computations. It can detect errors in other processes after receiving information from them and checking it against expected values using executable assertions. In order to efficiently check programs at run-time, we need to determine how many assertions need to be used, where they need to be located, and what they need to check to ensure that all occurring errors can be detected. This paper introduces temporal subsumption to remove, from a given set of assertions for a specific distributed program, the assertions which perform redundant checking. The remaining set of assertions is then the set necessary to provide run-time assurance. To subsume assertions, the flow graphs of the individual components of the distributed system are examined using a graph traversal algorithm. Temporal subsumption is a pre-processing step that creates a smaller set of assertions to be embedded into the program and to be checked at run-time. This makes error-detection at run-time less time-consuming and thus more efficient since redundant checking is avoided.
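
The pre-processing step described in this abstract can be approximated by a heavily simplified sketch. It is not the paper's algorithm: it handles only a single linear execution path rather than a full flow graph, and the representation (each step as a set of written variables plus a dictionary of named assertion conjuncts) and the name `prune_assertions` are assumptions made for illustration. A conjunct is dropped when an earlier assertion on the path already checked it and no statement since has written a variable it reads.

```python
from typing import Dict, List, Set, Tuple

# One step on the path: (variables written by the statement,
#                        {conjunct name: variables the conjunct reads})
Step = Tuple[Set[str], Dict[str, Set[str]]]


def prune_assertions(path: List[Step]) -> List[List[str]]:
    """Return, per step, the conjuncts that still need a run-time check."""
    established: Dict[str, Set[str]] = {}
    pruned: List[List[str]] = []
    for writes, checks in path:
        # The statement runs first: any conjunct reading a written
        # variable is no longer known to hold.
        established = {
            name: reads
            for name, reads in established.items()
            if not (reads & writes)
        }
        # Keep only the conjuncts not already established on this path.
        kept = [name for name in checks if name not in established]
        pruned.append(sorted(kept))
        # After this step's assertion, all of its conjuncts hold,
        # whether checked here or carried over.
        established.update(checks)
    return pruned
```

For example, if step 1 writes `x` and asserts `x>=0`, step 2 writes `y` and asserts both `x>=0` and `y<=x`, and step 3 writes `x` again and asserts `x>=0`, the sketch keeps `x>=0` at step 1, only `y<=x` at step 2, and `x>=0` again at step 3, since the second write to `x` invalidates the earlier check.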

Missouri University of Science and Technology (Missouri S&T): Scholars' Mine